
    Wh-Concord in Okinawan = Syntactic Movement + Morphological Merger

    The main purpose of this paper is to provide a novel account of Wh-Concord in Okinawan based on the Copy Theory of Movement and Distributed Morphology. We propose that Wh-Concord interrogatives and Japanese-type wh-interrogatives have exactly the same derivation in the syntactic component: the Q-particle -ga, base-generated as adjoined to a wh-phrase, undergoes movement to the clause-final position. The two types of interrogatives are distinguished in the post-syntactic component: only in Wh-Concord does the -r morpheme on C0 trigger Morphological Merger, which makes it possible to spell out the lower copy of -ga. It is shown that the proposed analysis correctly predicts three descriptive generalizations about the distribution of -ga in (i) syntactic islands, (ii) subordinate clauses, and (iii) (embedded) multiple wh-interrogatives.

    Composition, Attention, or Both?

    In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components -- the composition function and the self-attention mechanism -- can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components, with model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.
    Comment: Accepted to Findings of EMNLP 2022.
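    To make the idea of a composition function concrete, here is a minimal sketch (not the authors' CAG implementation) of recursively composing child vectors into a single subtree vector. The `compose` function, the tanh squash, and the toy 3-dimensional embeddings are all illustrative assumptions.

```python
import math

def compose(left, right, w=0.5):
    # Hypothetical composition function: combine two child vectors into a
    # single parent vector with a scaled elementwise tanh squash.
    return [math.tanh(w * (l + r)) for l, r in zip(left, right)]

# A tiny binary subtree ((the cat) sleeps) with invented 3-d embeddings.
the = [0.1, 0.2, 0.0]
cat = [0.5, -0.3, 0.4]
sleeps = [-0.2, 0.6, 0.1]

np_vec = compose(the, cat)        # compose the NP subtree
s_vec = compose(np_vec, sleeps)   # compose the full clause
print(len(s_vec))                 # -> 3: the subtree is reduced to one vector
```

    Each composition step collapses two children into one fixed-size vector, so an entire parse tree bottoms out as a single representation, which is the property the abstract attributes to the composition function.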

    JCoLA: Japanese Corpus of Linguistic Acceptability

    Neural language models have exhibited outstanding performance in a range of downstream tasks. However, there is limited understanding of the extent to which these models internalize syntactic knowledge, and various datasets have recently been constructed to facilitate the syntactic evaluation of language models across languages. In this paper, we introduce JCoLA (Japanese Corpus of Linguistic Acceptability), which consists of 10,020 sentences annotated with binary acceptability judgments. Specifically, those sentences were manually extracted from linguistics textbooks, handbooks, and journal articles, and split into in-domain data (86%; relatively simple acceptability judgments extracted from textbooks and handbooks) and out-of-domain data (14%; theoretically significant acceptability judgments extracted from journal articles), the latter of which is categorized by 12 linguistic phenomena. We then evaluate the syntactic knowledge of 9 different types of Japanese language models on JCoLA. The results demonstrated that several models could surpass human performance on the in-domain data, while no models were able to exceed human performance on the out-of-domain data. Error analyses by linguistic phenomenon further revealed that although neural language models are adept at handling local syntactic dependencies like argument structure, their performance wanes when confronted with long-distance syntactic dependencies like verbal agreement and NPI licensing.
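    As an illustration of how binary acceptability evaluation can work, the sketch below thresholds hypothetical length-normalized sentence scores against gold labels. The scores, the threshold, and the classification rule are toy assumptions, not the JCoLA evaluation protocol.

```python
# Hypothetical scores: length-normalized log-probabilities a model assigns to
# each sentence, paired with gold binary acceptability labels (1 = acceptable).
data = [(-1.2, 1), (-3.8, 0), (-1.5, 1), (-4.1, 0), (-2.9, 1)]

threshold = -2.5  # classify as acceptable if the score exceeds this
preds = [1 if score > threshold else 0 for score, _ in data]
accuracy = sum(p == g for p, (_, g) in zip(preds, data)) / len(data)
print(accuracy)  # 4 of 5 toy sentences classified correctly -> 0.8
```

    The last toy item (-2.9, gold acceptable) is misclassified, mirroring the kind of error analysis by phenomenon that the abstract describes.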

    Psychometric Predictive Power of Large Language Models

    Next-word probabilities from language models have been shown to successfully simulate human reading behavior. Building on this, we show that, interestingly, instruction-tuned large language models (LLMs) yield worse psychometric predictive power (PPP) for human reading behavior than base LLMs with equivalent perplexities. In other words, instruction tuning, which helps LLMs provide human-preferred responses, does not always make them more human-like from the perspective of computational psycholinguistics. In addition, we explore prompting methodologies for simulating human reading behavior with LLMs, showing that prompts reflecting a particular linguistic hypothesis lead LLMs to exhibit better PPP, but one still worse than that of base LLMs. These findings highlight that recent instruction tuning and prompting do not offer better estimates than direct probability measurements from base LLMs in cognitive modeling.
    Comment: 8 pages.
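    A minimal sketch of the underlying idea, under toy assumptions: per-word surprisals derived from an LM's probabilities are compared against human reading times. Here a simple Pearson correlation stands in for PPP (in practice PPP is usually measured as the log-likelihood improvement a surprisal predictor brings to a regression over reading times), and the probabilities and reading times below are invented.

```python
import math

def surprisal(p):
    # Surprisal in bits: -log2 of the probability the LM assigns to the word.
    return -math.log2(p)

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, computed from scratch.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-word LM probabilities and human reading times (ms).
probs = [0.5, 0.05, 0.2, 0.01, 0.3]
times = [210.0, 320.0, 250.0, 380.0, 230.0]

surprisals = [surprisal(p) for p in probs]
ppp = pearson(surprisals, times)  # higher correlation -> better fit to humans
print(round(ppp, 2))
```

    On this toy data, less probable words line up with longer reading times, so the correlation is high; the paper's finding is that this fit is systematically worse for instruction-tuned models than for base models at matched perplexity.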

    Cross-linguistic patterns of morpheme order reflect cognitive biases: An experimental study of case and number morphology

    A foundational goal of linguistics is to investigate whether shared features of the human cognitive system can explain how linguistic patterns are distributed across languages. In this paper we report a series of artificial language learning experiments which aim to test a hypothesised link between cognition and a persistent regularity of morpheme order: number morphemes (e.g., plural markers) tend to be ordered closer to noun stems than case morphemes (e.g., accusative markers) (Universal 39; Greenberg, 1963). We argue that this typological tendency may be driven by learners' bias towards orders that reflect scopal relationships in morphosyntactic and semantic composition (Bybee, 1985; Rice, 2000; Culbertson & Adger, 2014). This bias is borne out by our experimental results: learners, in the absence of any evidence on how to order number and case morphology, consistently produce number closer to the noun stem. We replicate this effect across two populations (English and Japanese speakers). We also find that it holds independent of morpheme position (prefixal or suffixal), degree of boundedness (free or bound morphology), frequency, and which particular case/number feature values are instantiated in the overt markers (accusative or nominative, plural or singulative). However, we show that this tendency can be reversed when the form of the case marker is made highly dependent on the noun stem, suggesting the influence of an additional bias for local dependencies. Our results provide evidence that universal features of cognition may play a causal role in shaping the relative order of morphemes.

    Context Limitations Make Neural Language Models More Human-Like

    Language models (LMs) have been used in cognitive modeling as well as engineering studies -- they compute information-theoretic complexity metrics that simulate humans' cognitive load during reading. This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and those of humans. Our results showed that constraining the LMs' context access improved their simulation of human reading behavior. We also showed that LM-human gaps in context access were associated with specific syntactic constructions; incorporating syntactic biases into LMs' context access might enhance their cognitive plausibility.
    Comment: Accepted to EMNLP 2022 (main conference, long paper).
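    The core manipulation (constraining how much preceding context the LM may condition on) can be sketched as follows. The lookup-table "LM", its probabilities, and the agreement example are hypothetical, chosen only to show how truncating the context window changes a model's next-word predictions.

```python
# Toy lookup "LM": next-word distributions keyed on the conditioning context.
TOY_LM = {
    ("the", "keys", "to", "the", "cabinet"): {"are": 0.7, "is": 0.1},
    ("the", "cabinet"): {"are": 0.1, "is": 0.7},
}

def next_word_prob(context, word, max_context=None):
    # Optionally constrain context access to the last `max_context` tokens,
    # mimicking a limited-context LM.
    if max_context is not None:
        context = context[-max_context:]
    dist = TOY_LM.get(tuple(context), {})
    return dist.get(word, 0.05)  # uniform fallback for unseen cases

ctx = ["the", "keys", "to", "the", "cabinet"]
full = next_word_prob(ctx, "are")                  # full-context model
short = next_word_prob(ctx, "are", max_context=2)  # context-limited model
print(full, short)  # 0.7 0.1: truncation flips the agreement preference
```

    With only the last two tokens visible, the toy model conditions on the local noun "cabinet" rather than the head noun "keys", the kind of construction-specific behavior shift the abstract associates with limited context access.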

    Design of BCCWJ-EEG : Balanced Corpus with Human Electroencephalography

    Waseda University / National Institute for Japanese Language and Linguistics
    The past decade has witnessed a happy marriage between natural language processing (NLP) and the cognitive science of language. Moreover, given the historical relationship between biological and artificial neural networks, the advent of deep learning has re-sparked strong interest in the fusion of NLP and the neuroscience of language. Importantly, this cross-fertilization between NLP, on the one hand, and the cognitive (neuro)science of language, on the other, has been driven by language resources annotated with human language processing data. However, those language resources remain limited in their annotations, genres, languages, etc. In this paper, we describe the design of a novel language resource called BCCWJ-EEG, the Balanced Corpus of Contemporary Written Japanese (BCCWJ) experimentally annotated with human electroencephalography (EEG). Specifically, after extensively reviewing the language resources currently available in the literature, with special focus on eye-tracking and EEG, we summarize the details concerning (i) participants, (ii) stimuli, (iii) procedure, (iv) data preprocessing, (v) corpus evaluation, (vi) resource release, and (vii) the compilation schedule. In addition, potential applications of BCCWJ-EEG to neuroscience and NLP are also discussed.